[pull] main from triggerdotdev:main#114
Merged
… queues (#3558)

## Summary

Queues that use concurrency keys can no longer bypass the per-queue length cap, and the "Queued | Running" columns in the dashboard now show the true total across all CK variants instead of 0.

The cap and the dashboard both relied on `ZCARD` of the base queue key, but CK-keyed runs live under `<base>:ck:<variant>` keys. Any queue that used concurrency keys therefore read 0, which let a single CK variant grow unbounded past the user's configured cap.

## Fix

Two per-base-queue counters are maintained inside the CK Lua scripts: `<base>:lengthCounter` and `<base>:runningCounter`. Non-CK enqueue/dequeue paths are untouched.

Counters are lazy-initialized the first time a CK enqueue (or nack) lands on a queue: the Lua script sums `ZCARD` across the variants tracked by `ckIndex`, sets the counter, then `INCR`s. Pre-existing CK backlog on already-populated queues is captured automatically, so no batch migration is required.

`INCR`/`DECR` is gated on `ZADD`/`SADD` returning 1 (a new entry vs an idempotent no-op), so duplicate enqueues or re-dequeues don't inflate the counter.

The counter is `SET` with a 24-hour TTL on init. `INCR`/`DECR` do not extend the TTL, so the counter expires daily and the next CK operation re-seeds it from `ckIndex`. This bounds any drift that accumulates during the rolling-deploy overlap window, where old (un-Tracked) and new (Tracked) webapp instances briefly coexist, to ≤24 hours, with no admin sweep or background reconciler needed.

Read paths pipeline `ZCARD`/`SCARD` on the base key + `GET` on the counter and sum. A missing counter is treated as 0, so pure non-CK queues see the same answer as before.

The counter-aware scripts ship alongside the originals with a `Tracked` suffix for rolling-deploy safety; a follow-up PR will drop the originals once this has rolled out.
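The counter behaviour described above can be sketched with an in-memory stand-in for Redis. This is only an illustration under stated assumptions, not the PR's Lua scripts: the `Store` shape and the function names (`enqueueTracked`, `queueLength`, `liveCounter`, and so on) are hypothetical, sorted sets are modelled as plain sets without scores, and only the length counter is covered (the running counter and the `DECR` path are omitted).

```typescript
// In-memory stand-in for the Redis keys involved. Illustration only:
// every name below is hypothetical and not taken from the PR.
const COUNTER_TTL_MS = 24 * 60 * 60 * 1000;

interface Counter {
  value: number;
  expiresAt: number;
}

interface Store {
  zsets: Map<string, Set<string>>; // sorted sets modelled as plain sets (scores omitted)
  ckIndex: Map<string, Set<string>>; // base queue key -> CK variant queue keys
  counters: Map<string, Counter>;
}

function createStore(): Store {
  return { zsets: new Map(), ckIndex: new Map(), counters: new Map() };
}

// Returns the set stored under `key`, creating an empty one if needed.
function members(map: Map<string, Set<string>>, key: string): Set<string> {
  let set = map.get(key);
  if (!set) {
    set = new Set<string>();
    map.set(key, set);
  }
  return set;
}

// ZCARD equivalent: a missing key has cardinality 0.
function zcard(store: Store, key: string): number {
  const set = store.zsets.get(key);
  return set ? set.size : 0;
}

// GET equivalent that honours the TTL: an expired counter reads as missing.
function liveCounter(store: Store, key: string, now: number): Counter | undefined {
  const counter = store.counters.get(key);
  if (counter && counter.expiresAt <= now) {
    store.counters.delete(key);
    return undefined;
  }
  return counter;
}

function enqueueTracked(store: Store, base: string, variant: string, runId: string, now: number): void {
  const variantKey = `${base}:ck:${variant}`;
  const counterKey = `${base}:lengthCounter`;

  let counter = liveCounter(store, counterKey, now);
  if (!counter) {
    // Lazy init: seed from the backlog already sitting in the tracked variants,
    // and set the 24-hour TTL once, at seed time.
    const variants = store.ckIndex.get(base);
    const keys: string[] = variants ? Array.from(variants) : [];
    const backlog = keys.reduce((sum, key) => sum + zcard(store, key), 0);
    counter = { value: backlog, expiresAt: now + COUNTER_TTL_MS };
    store.counters.set(counterKey, counter);
  }

  members(store.ckIndex, base).add(variantKey);

  // Gate the increment on the add being new, mirroring ZADD returning 1.
  const queue = members(store.zsets, variantKey);
  const isNew = !queue.has(runId);
  queue.add(runId);
  if (isNew) {
    counter.value += 1; // the increment leaves expiresAt untouched
  }
}

// Read path: base-key cardinality plus the counter, with a missing counter read as 0.
function queueLength(store: Store, base: string, now: number): number {
  const counter = liveCounter(store, `${base}:lengthCounter`, now);
  return zcard(store, base) + (counter ? counter.value : 0);
}
```

In this model, a queue with three runs already in one variant and no counter reads 4 after one more tracked enqueue, a repeated enqueue of the same run leaves it at 4, and a pure non-CK queue never creates a counter at all. Once the TTL passes, the CK contribution reads as 0 until the next tracked enqueue re-seeds it.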
## Test plan

- [ ] `pnpm run test --filter @internal/run-engine`: 116 tests pass, including a new `ckCounters.test.ts` covering lazy init from pre-existing backlog, churn, floor-at-zero, the non-CK regression case, mixed CK + non-CK on the same base queue, idempotent re-enqueue (ZADD-already-exists), 24h TTL on the counter, and nack re-seeding after counter expiry.
- [ ] Verified end-to-end against a live local environment:
  - Triggered 24 CK enqueues across 4 variants → `lengthCounter=16`, `runningCounter=8`, dashboard showed Queued=16 / Running=8 for the CK queue.
  - Set the env queue cap to 16, triggered 12 more enqueues → 8 succeeded, 4 rejected with `QueueSizeLimitExceededError`.
  - Deleted the counter on a queue with 31 messages already sitting in CK variants, triggered one more enqueue → counter materialized to 31 from the `ckIndex` sum, then `INCR`'d.
## Summary

Local ClickHouse was burning ~325% CPU endlessly merging its own telemetry tables (`metric_log`, `asynchronous_metric_log`, `part_log`, `trace_log`) after the container had been running long enough to accumulate hundreds of GB of system-log data. OrbStack Helper reflected this on the host (~400% CPU).

These tables are not used by anything in the dev stack. They only exist for ClickHouse to log itself, so disabling them eliminates the merge churn entirely.

## Changes

- Adds `docker/config/clickhouse-disable-system-logs.xml`, mounted into `/etc/clickhouse-server/config.d/`, that removes the noisy system log tables via `<table remove="1"/>`.
- Mounts the override file in `docker/docker-compose.yml`.

After applying, idle CPU dropped from 325% to ~12% on my machine.

## Test plan

- [ ] `pnpm run docker` brings up the stack cleanly
- [ ] `docker stats clickhouse` shows low idle CPU
- [ ] App functionality unaffected (system log tables are not queried by the webapp)
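For readers unfamiliar with ClickHouse config overrides, a file of this kind might look roughly like the fragment below. The actual contents of `clickhouse-disable-system-logs.xml` are not shown in this description, so this is an assumption: it covers only the four tables named in the summary and uses the `<clickhouse>` root element of recent ClickHouse releases (older releases use `<yandex>`).

```xml
<!-- Hypothetical sketch of a config.d override; the real file may differ. -->
<clickhouse>
    <!-- remove="1" drops the section from the merged server config,
         so the corresponding system log table is no longer written -->
    <metric_log remove="1"/>
    <asynchronous_metric_log remove="1"/>
    <part_log remove="1"/>
    <trace_log remove="1"/>
</clickhouse>
```

Files placed in `/etc/clickhouse-server/config.d/` are merged into the main server config at startup, which is why mounting a single override file is enough and the base config does not need editing.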
…mpling (#3567)

## Summary

Follow-up to #3561. The drift-audit workflow timed out on PR #3542 (92 files, +5962 lines) by hitting `--max-turns 15` before reaching a verdict, leaving a red ❌ on that PR with no sticky comment.

## Changes

- `--max-turns` bumped from 15 to 30.
- Prompt now opens with an explicit "Strategy" section: read REVIEW.md once, scan the file-list only, open at most 5 files (3-5 on PRs >50 files), and bias toward finishing over exploring.
- Final rule: *"when in doubt between one more file read and finish now — finish now."*

The audit is allowed to miss things. It is not allowed to time out and leave a red X.

## Test plan

- [ ] Verify this PR's audit posts `✅ REVIEW.md looks current for this PR.` (small diff)
- [ ] After merge, retry the audit on #3542 or a similarly large PR and confirm it completes
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)